# 2023 QA/QC {.unnumbered}

---
execute: 
  echo: false
date: "`r Sys.Date()`"
format: 
  html: 
    code-fold: true
    code-tools: true
    code-summary: "Show the code"
---

## Introduction

Prior to publishing analysis and interpretation of water quality data, we will ensure that all data meets the QA/QC standards outlined in the current project [Quality Assurance Project Plan (QAPP)](https://www.kenaiwatershed.org/news-media/qapp-revisions-completed-2023/) and is accessible in the appropriate public repository.

Water quality data from this project is ultimately destined for the Environmental Protection Agency's Water Quality Exchange (EPA WQX).

The QAPP for this project describes data management details and responsible parties for each step of the data pipeline from observation to repository. The 2023 data preparation and review process is published here.

### Year 2023 Water Quality Data

In this chapter we will collate 2023 laboratory data from several sources into a single spreadsheet document with a consistent format. The desired end format is a spreadsheet template provided by the EPA Water Quality Exchange. These template files are available to download from the EPA at <https://www.epa.gov/waterdata/water-quality-exchange-web-template-files>.

Once the data is collated, it will be evaluated according to a Quality Assurance Checklist (template example provided by the Alaska Department of Environmental Conservation Soldotna office). Field observations that do not meet the quality assurance standards described in the evaluation checklist will be flagged as "Rejected" before being uploaded to the EPA WQX.

Data that has been uploaded to the EPA WQX is evaluated biennially by the Alaska Department of Environmental Conservation (ADEC) in their [Integrated Water Quality Monitoring and Assessment Report](https://dec.alaska.gov/water/water-quality/integrated-report/)[^2023-1].
The integrated report evaluates available water quality data from the previous five years against Alaska water quality standards and regulations [@adec2020].

[^2023-1]: https://dec.alaska.gov/water/water-quality/integrated-report/

#### 2023 Water Quality Data Formatting

The code scripts in this document assemble water quality data from the three analytical laboratories that partnered with Kenai Watershed Forum for this project in 2023:

- SGS Laboratories (Anchorage, AK)
- Soldotna Wastewater Treatment Plant (Soldotna, AK)

<br>

------------------------------------------------------------------------

##### 2023 Metals/Nutrients Lab Results (SGS Labs)

```{r, echo = F, message = F}
library("xfun")

xfun::embed_file('other/input/2023/spring_2023_wqx_data/Data/SGS/Revision 1 SGS Data/1231846_FC5912_Rev1.csv',
                 text = "Download Original Spring 2023 Metals/Nutrients Lab Results from SGS - Electronic Data Delivery file")

xfun::embed_file('other/input/2023/spring_2023_wqx_data/Data/SGS/Revision 1 SGS Data/SGS_Spring_2023_data_final.pdf',
                 text = "Download Original Spring 2023 Metals/Nutrients Lab Results from SGS - PDF file")
```

```{r, echo = F}
xfun::embed_file('other/input/2023/summer_2023_wqx_data/SGS/Summer 2023 SGS Agency Baseline.xlsx',
                 text = "Download Original Summer 2023 Metals/Nutrients Lab Results from SGS - Electronic Data Delivery file")

xfun::embed_file('other/input/2023/summer_2023_wqx_data/SGS/1233640.pdf',
                 text = "Download Original Summer 2023 Metals/Nutrients Lab Results from SGS - PDF file")

xfun::embed_file('other/input/2023/summer_2023_wqx_data/SGS/1233640_COC.pdf',
                 text = "Download Original Summer 2023 Metals/Nutrients Chain of Custody docs from SGS - PDF file")
```

<br>

```{r, 2023 WQX formatting for SGS, echo = F, message = F}
#| warning: false
#| message: false

# clear environment
rm(list = ls())

# load packages
library(tidyverse)
library(readxl)
library(openxlsx)
library(data.table)
library(stringr)
library(magrittr)
library(janitor)
library(hms)
library(lubridate)
library(anytime)
library(stringi)
xfun::pkg_load2(c("htmltools", "mime"))

# Assign 2023 field sample dates
# (variable names corrected from "21" to "23"; the summer date was recorded as
# "7/27/2021" in the original script, presumably a carry-over typo from 2021)

# Spring 2023 sampling date
spring23_sample_date <- "5/2/2023"

# Summer 2023 sampling date
summer23_sample_date <- "7/27/2023"
```

```{r message = FALSE, echo = F, include = F}
#| warning: false
#| message: false

####################################
##### Read in and Clean SGS Data
####################################

##### Part A: SGS Data Read In #####

# Reformat SGS data downloaded from their server client (SGS Engage, full EDD
# files) to match the AQWMS template

# read in; address column type issues such that both dataframes can be merged
spring_batch_sgs23 <- read.csv("other/input/2023/spring_2023_wqx_data/Data/SGS/Revision 1 SGS Data/1231846_FC5912_Rev1.csv") %>%
  select(-PROJECT_ID, -DISSOLVED)

summer_batch_sgs23 <- read_excel("other/input/2023/summer_2023_wqx_data/SGS/Summer 2023 SGS Agency Baseline.xlsx",
                                 sheet = "Sheet9") %>%
  select(-PROJECT_ID, -DISSOLVED)

# later, need to address the differing categorical values that were in the
# removed "DISSOLVED" column:
#> unique(spring_batch_sgs23$DISSOLVED)
# [1] TRUE
#> unique(summer_batch_sgs23$DISSOLVED)
# [1] "." "L" "T"

# joining and preparatory steps

## clean up column names and bind seasons together
sgs23 <- bind_rows(spring_batch_sgs23, summer_batch_sgs23) %>%
  clean_names() %>%
  remove_empty() %>%
  # add lab name
  mutate(lab_name = "SGS North America, Anchorage, Alaska") %>%
  # make 'matrix' column consistent
  select(-matrix) %>%
  mutate(matrix = "Water") %>%
  # prepare separate time and date columns for
  # - sample collection
  # - lab receipt
  # - lab run
  # - extraction
  # NOTE: summer 2023 lab receipt times/dates are missing
  transform(collect_date = mdy_hm(collect_date),
            rec_date = mdy_hm(rec_date),
            run_date = mdy_hm(run_date),
            extracted_date = mdy_hm(extracted_date)) %>%
  separate(collect_date, sep = " ", into = c("collect_date", "collect_time")) %>%
  separate(rec_date, sep = " ", into = c("rec_date", "rec_time")) %>%
  separate(run_date, sep = " ", into = c("run_date", "run_time")) %>%
  separate(extracted_date, sep = " ", into = c("extracted_date", "extracted_time"))

# NOTE: in some past years, total metals analyses (method 200.7) have been
# subcontracted to ALS Laboratories (Seattle). In 2023, SGS Anchorage had the
# 200.7 methods run by SGS Orlando. Thus, these results are already read in here.

# remove individual dataframes
rm(spring_batch_sgs23, summer_batch_sgs23)

##### Part B: Create Consistent Sample Location Names #####

# export list of unique sample_id
sgs_sitenames <- sgs23 %>%
  distinct(sample_id, collect_date)
colnames(sgs_sitenames) <- c("sample_id", "collect_date")
sgs_sitenames %>% arrange(sample_id)
write.csv(sgs_sitenames, "other/input/2023/misc/site_names.csv", row.names = F)

# manually edit site names csv (external to script)

# read in manually edited csv
sgs_sitenames <- read_csv("other/input/2023/misc/site_names_manual_edit.csv") %>%
  filter(!is.na(site_id)) %>%
  # remove trailing spaces
  mutate(sample_id = str_trim(sample_id, side = "right"))
colnames(sgs_sitenames) <- c("sample_id", "site_id")
#write.csv(sgs_sitenames, "other/input/2023/misc/site_names_manual_edit.csv", row.names = F)

# join site_id to main dataframe
sgs23 <- left_join(sgs23, sgs_sitenames, by = "sample_id")

# join "monitoring location id" to overall sgs23 dataframe

# import example data w/ monitoring location IDs
location_ids <- read.csv("other/output/example_output/results_activities_2021.csv") %>%
  clean_names() %>%
  select(monitoring_location_id, activity_id) %>%
  separate(activity_id, sep = "-", into = "site_id") %>%
  distinct()

# join
sgs23 <- left_join(sgs23, location_ids, by = "site_id")

# NOTE: the joins for several sites are not cooperating.
# These sites include "No Name Creek", "Skilak Lake Outflow", and "Jim's Landing".
# After investigating possible reasons (white spaces, etc.), no diagnosis is evident.
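# Illustrative diagnosis sketch (toy data, not the project files): anti_join()
# lists keys present on one side of a join but unmatched on the other, which
# can surface problems such as a trailing space in sample_id.
# The second site_id below is hypothetical.
join_check_left  <- data.frame(sample_id = c("No Name Creek ", "Moose River"))
join_check_right <- data.frame(sample_id = c("No Name Creek", "Moose River"),
                               site_id   = c("KBL_t_00.0", "MOO_r_01.5"))
unmatched <- anti_join(join_check_left, join_check_right, by = "sample_id")
# "unmatched" holds the single row whose sample_id carries a trailing space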
# For time efficiency, we will instead manually rectify this in-script with
# mutate() and case_when()
sgs23 %<>%
  # No Name Creek
  # (fallback corrected from "TRUE ~ sample_id", which overwrote the joined site_id)
  mutate(site_id = case_when(
    grepl("No Name|NO NAME", sample_id) ~ "KBL_t_00.0",
    TRUE ~ site_id)) %>%
  transform(monitoring_location_id = as.character(monitoring_location_id)) %>%
  mutate(monitoring_location_id = case_when(
    site_id == "KBL_t_00.0" ~ "10000008",
    TRUE ~ monitoring_location_id)) %>%
  # Skilak Outlet
  mutate(monitoring_location_id = case_when(
    grepl("Skilak|SKILAK", sample_id) ~ "10000030",
    TRUE ~ monitoring_location_id)) %>%
  # Jim's Landing
  mutate(monitoring_location_id = case_when(
    grepl("JIM'S|Jim's", sample_id) ~ "10000031",
    TRUE ~ monitoring_location_id))

# all 2023 SGS data joined and present.
```

```{r message = FALSE, echo = F, include = F}
#| warning: false
#| message: false

##### Part C #####

# Ensure information about project sample type is placed in the correct column
# (activity_type): method blank, trip blank, field duplicate
# (pattern corrected from grepl("DUP","Dup", ...), which passed "Dup" as the
# ignore.case argument rather than as part of the pattern)
sgs23 %<>%
  mutate(activity_type = case_when(
    grepl("DUP|Dup", sample_id) ~ "Quality Control Field Replicate Msr/Obs",
    grepl("FB", sample_id) ~ "Quality Control Sample-Field Blank",
    grepl("TRIP BLANK", sample_id) ~ "Quality Control Sample-Trip Blank",
    TRUE ~ ""))

# general strategy: get all columns of the dataframe to conform to final status
# before joining (e.g. do all SGS first, then SWWTP)
```

##### 2023 Fecal Coliform Lab Results (Soldotna Wastewater Treatment Plant Lab)

"Formulaic" final steps remaining for the combined dataframe:

- activity_id (concatenate)
- Result Analytical Method ID (matching)
- Result Qualifier (matching; verify symbols, SGS terms vs. CDX terms)
- Result Sample Fraction (matching)
- Method Speciation (matching)
- Sample Collection Equipment Name (matching)
- Result Detection Condition (calculated)
- Sample Container Type (matching)
- Sample Container Color (matching)
- Chemical Preservative Used (matching)

"Static" final steps remaining for the combined dataframe: all other columns.

QA/QC notes:

- Are sample collection dates unique to the actual recorded dates? (No; spring has two dates; lab typo.)
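The "formulaic" steps listed above can be sketched in advance. The chunk below is illustrative only: the column names, the activity-type suffixes ("SR", "FB"), and the qualifier-to-detection-condition mapping are assumptions for demonstration, not the project's final convention.

```{r}
# Sketch of two "formulaic" steps on toy data: concatenating activity_id and
# translating a lab qualifier flag into a WQX-style result detection condition.
library(dplyr)

results <- data.frame(
  monitoring_location_id = c("10000008", "10000030"),
  collect_date           = c("2023-05-02", "2023-05-02"),
  activity_type          = c("Sample-Routine", "Quality Control Sample-Field Blank"),
  lab_qualifier          = c("U", ""),
  stringsAsFactors = FALSE
)

results <- results %>%
  mutate(
    # activity_id: concatenate site, date, and an activity-type suffix
    # (suffix codes here are hypothetical)
    activity_id = paste(monitoring_location_id, collect_date,
                        if_else(grepl("Blank", activity_type), "FB", "SR"),
                        sep = "-"),
    # a "U" (undetected) lab flag maps to a detection condition; others stay NA
    result_detection_condition = case_when(
      lab_qualifier == "U" ~ "Not Detected",
      TRUE ~ NA_character_
    )
  )
```

The same pattern (one `mutate()` per "formulaic" column, driven by `case_when()` or a lookup-table join) extends to the remaining matching steps in the list above.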